1,128 research outputs found

    Secondary Indexing in One Dimension: Beyond B-trees and Bitmap Indexes

    Let S be a finite, ordered alphabet, and let x = x_1 x_2 ... x_n be a string over S. A "secondary index" for x answers alphabet range queries of the form: given a range [a_l; a_r] over S, return the set I_{[a_l; a_r]} = {i | x_i \in [a_l; a_r]}. Secondary indexes are heavily used in relational databases and scientific data analysis. It is well known that the obvious solution, storing a dictionary for the position set associated with each character, does not always give optimal query time. In this paper we give the first theoretically optimal data structure for the secondary indexing problem. In the I/O model, the amount of data read when answering a query is within a constant factor of the minimum space needed to represent I_{[a_l; a_r]}, assuming that the size of internal memory is (|S| log n)^{delta} blocks, for some constant delta > 0. The space usage of the data structure is O(n log |S|) bits in the worst case, and we further show how to bound the size of the data structure in terms of the 0-th order entropy of x. We show how to support updates, achieving various time-space trade-offs. We also consider an approximate version of the basic secondary indexing problem in which a query reports a superset of I_{[a_l; a_r]} containing each element not in I_{[a_l; a_r]} with probability at most epsilon, where epsilon > 0 is the false positive probability. For this problem the amount of data that needs to be read by the query algorithm is reduced to O(|I_{[a_l; a_r]}| log(1/epsilon)) bits. Comment: 16 pages
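    To make the query definition concrete, the following is a minimal Python sketch of the "obvious solution" the abstract mentions: a dictionary mapping each character to its position set. It is not the paper's optimal data structure. The function names `build_index` and `range_query` are hypothetical, and the alphabet order is assumed to be Python's default ordering on characters.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def build_index(x):
    """Map each character of x to the sorted list of 1-based positions where it occurs."""
    positions = defaultdict(list)
    for i, ch in enumerate(x, start=1):
        positions[ch].append(i)
    return dict(positions)


def range_query(index, a_l, a_r):
    """Return I_[a_l; a_r] = {i | x_i in [a_l; a_r]} as a sorted list."""
    alphabet = sorted(index)            # characters that actually occur in x
    lo = bisect_left(alphabet, a_l)     # first character >= a_l
    hi = bisect_right(alphabet, a_r)    # one past the last character <= a_r
    result = []
    for ch in alphabet[lo:hi]:
        result.extend(index[ch])        # merge the per-character position lists
    return sorted(result)


index = build_index("banana")
print(range_query(index, "a", "b"))  # positions of 'a' and 'b'
```

    In this naive scheme a query touches one position list per character in the range, which illustrates why it does not always give optimal query time when the range spans many characters.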

    A Review on Detection of Medical Plant Images

    Both human and non-human life on Earth depends heavily on plants, which influence the natural cycle more than anything else. Because of the sophistication of recent plant discoveries and the computerization of plants, plant identification is particularly challenging in biology and agriculture. Automatic plant classification systems are needed for a variety of reasons, including instruction, resource evaluation, and environmental protection. The leaves of medicinal plants are thought to be what distinguishes them. Identifying plant species automatically from photographs of their leaves is an attractive goal because taxonomists are undertrained and biodiversity is quickly vanishing. Due to the need for mass production, these plants must be identified quickly. The physical and emotional health of people must be taken into consideration when developing drugs. An important step in processing medicinal herbs is identifying and classifying them. Since there are few specialists in this field, it can be difficult to identify and categorize medicinal plants correctly. A fully automated approach is therefore best suited to identifying medicinal plants. This article briefly summarizes the numerous methods for categorizing medicinal plants based on the silhouette and roughness of a plant's leaf.

    A Partition Theorem for a Randomly Selected Large Population

    We state and prove a proposition on partitioning a randomly selected large population into stationary and non-stationary populations by using a property of the stationary population identity. The applicability of this theorem for practical purposes is summarized at the end. Comment: 7 pages, a new result in population dynamics

    Compressing Binary Decision Diagrams

    The paper introduces a new technique for compressing Binary Decision Diagrams (BDDs) in those cases where random access is not required. Using this technique, compression and decompression can be done in time linear in the size of the BDD, and compression will in many cases reduce the size of the BDD to 1-2 bits per node. Empirical results for our compression technique are presented, including comparisons with previously introduced techniques, showing that the new technique dominates on all tested instances. Comment: Full (tech-report) version of ECAI 2008 short paper